Content
- Brief Introduction
- Details About Data
- Data Sources
- Data Fields
- Main Technologies Used
- Data Transformation
- Main Libraries
- Data Wrangling
- Shiny Application
- Structure of ui
- Structure of server
- Tables
- Graphs
- Deployment
- References
1. Brief Introduction
This dashboard is built with R Shiny, and this documentation is published on the website with R Markdown. It describes the technical details of how the web application was built, including how the data was cleaned and organized before being loaded into the dashboard's visualizations.
How to see information on the website:
2. Details About Data
Details about the dataset can be found here: data.gov.sg
3. Main Technologies Used
4. Data Transformation
Please refer to the DataWrangling.R file for more details.
Main Libraries
First, we import the dataset. We use the DT package to display an interactive table that fits the page.
Data Fields
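library(DT)  # provides datatable()

# Load the raw employment survey data
data <- read.csv('data/employment_data.csv')
# Interactive table: 5 rows per page, column filters on top, horizontal scrolling
datatable(data, rownames = FALSE, filter = "top", class = "table",
          options = list(pageLength = 5, scrollX = TRUE))
Check the dimensions of the data frame.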
## [1] 703 12
As a first step, we convert the following variables from factor to numeric:
- employment_rate_overall
- employment_rate_ft_perm
- basic_monthly_mean
- basic_monthly_median
- gross_monthly_mean
- gross_monthly_median
- gross_mthly_25_percentile
- gross_mthly_75_percentile
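The conversion step listed above can be sketched as follows. This is a minimal illustration, not the code from DataWrangling.R: the `toy` data frame and its values are made up, only two of the eight columns are shown, and the "na" placeholder is an assumption about why the columns were not read as numeric in the first place. Note that with R 4.0 or later, `read.csv` returns character columns rather than factors by default; the same conversion works in both cases.

```r
# Toy stand-in for the real dataset (values are made up for illustration).
toy <- data.frame(
  year = c(2013L, 2013L, 2014L),
  employment_rate_overall = factor(c("97.4", "na", "90.9")),
  basic_monthly_median = factor(c("3200", "na", "3000"))
)

# Columns to convert; the real dataset has eight of them.
num_cols <- c("employment_rate_overall", "basic_monthly_median")

# as.numeric() on a factor returns the internal level codes, so convert
# to character first. Non-numeric entries such as "na" become NA.
toy[num_cols] <- lapply(toy[num_cols], function(x) {
  suppressWarnings(as.numeric(as.character(x)))
})

str(toy)
```

Going through `as.character()` is the important part: calling `as.numeric()` directly on a factor silently yields level indices instead of the original values.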
Data Wrangling
## 'data.frame': 703 obs. of 12 variables:
## $ year : int 2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
## $ university : chr "Nanyang Technological University" "Nanyang Technological University" "Nanyang Technological University" "Nanyang Technological University" ...
## $ school : chr "College of Business (Nanyang Business School)" "College of Business (Nanyang Business School)" "College of Business (Nanyang Business School)" "College of Business (Nanyang Business School)" ...
## $ degree : chr "Accountancy and Business" "Accountancy (3-yr direct Honours Programme)" "Business (3-yr direct Honours Programme)" "Business and Computing" ...
## $ employment_rate_overall : num 97.4 97.1 90.9 87.5 95.3 81.3 87.3 90.3 94.8 92.1 ...
## $ employment_rate_ft_perm : num 96.1 95.7 85.7 87.5 95.3 68.8 85.1 88.2 93.8 88.5 ...
## $ basic_monthly_mean : num 3701 2850 3053 3557 3494 ...
## $ basic_monthly_median : num 3200 2700 3000 3400 3500 2900 3000 3100 3000 3000 ...
## $ gross_monthly_mean : num 3727 2938 3214 3615 3536 ...
## $ gross_monthly_median : num 3350 2700 3000 3400 3500 ...
## $ gross_mthly_25_percentile: num 2900 2700 2700 3000 3100 ...
## $ gross_mthly_75_percentile: num 4000 2900 3500 4100 3816 ...
Check how many missing values there are in each column.
| column | missing values |
|---|---|
| year | 0 |
| university | 0 |
| school | 0 |
| degree | 0 |
| employment_rate_overall | 73 |
| employment_rate_ft_perm | 73 |
| basic_monthly_mean | 73 |
| basic_monthly_median | 73 |
| gross_monthly_mean | 73 |
| gross_monthly_median | 73 |
| gross_mthly_25_percentile | 73 |
| gross_mthly_75_percentile | 73 |
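A per-column count like the one in the table above could be produced along these lines. This is a sketch on a made-up `toy` data frame; the actual call used for the dashboard is not shown in this document.

```r
# Toy stand-in for the real dataset (values are made up for illustration).
toy <- data.frame(
  year = c(2013L, 2013L, 2014L),
  degree = c("A", "B", "C"),
  gross_monthly_median = c(3350, NA, 3000),
  stringsAsFactors = FALSE
)

# is.na() returns a logical matrix; summing each column counts the NAs.
missing_per_column <- colSums(is.na(toy))
print(missing_per_column)
```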
Remove the rows that contain missing values, then check the dimensions again.
## [1] 630 12
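One way to drop incomplete rows is sketched below with base R, again on a made-up `toy` data frame. Whether the original script used `complete.cases()`, `na.omit()`, or another approach is not stated here, so treat the choice as an assumption.

```r
# Toy stand-in for the real dataset (values are made up for illustration).
toy <- data.frame(
  year = c(2013L, 2013L, 2014L, 2014L),
  gross_monthly_median = c(3350, NA, 3000, NA)
)

# Keep only the rows that have no missing value in any column.
clean <- toy[complete.cases(toy), ]

# Rows and columns remaining after the filter.
dim(clean)
```

Because the real dataset has 73 missing values in each of the eight numeric columns and the row count drops from 703 to 630, the missing values evidently sit in the same 73 rows.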
Let’s take a look at how many records we have for each university.
| university | count |
|---|---|
| Nanyang Technological University | 204 |
| National University of Singapore | 207 |
| Singapore Institute of Technology | 135 |
| Singapore Management University | 72 |
| Singapore University of Social Sciences | 3 |
| Singapore University of Technology and Design | 9 |
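The record counts above can be obtained with a frequency table. The sketch below uses base R on a made-up `toy` data frame; the column names `university` and `count` mirror the table, but the original code may well have used dplyr instead.

```r
# Toy stand-in for the real dataset (rows are made up for illustration).
toy <- data.frame(
  university = c("Nanyang Technological University",
                 "National University of Singapore",
                 "Nanyang Technological University"),
  stringsAsFactors = FALSE
)

# table() tallies each distinct value; as.data.frame() turns the tally
# into a two-column data frame with the count stored in "count".
counts <- as.data.frame(table(university = toy$university),
                        responseName = "count",
                        stringsAsFactors = FALSE)
print(counts)
```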
Let’s compare the median monthly income between universities. The graph shows that Singapore University of Technology and Design (SUTD) graduates have the highest salaries on average (however, with only 9 records this is likely to be biased), followed by Singapore Management University. Graduates of the top universities, National University of Singapore and Nanyang Technological University, rank 3rd and 4th, respectively, among the 6 universities.
library(ggplot2)

# Boxplot of the basic monthly median salary for each university
p <- ggplot(data, aes(x = university, y = basic_monthly_median)) +
  geom_boxplot(fill = "steelblue", alpha = 0.5) +
  xlab("University") + ylab("Basic Monthly Median of Graduates")
# Flip the axes so the long university names stay readable
p + coord_flip()
Let’s dive deeper into Nanyang Technological University and list the basic monthly median income for each school. The plot shows that graduates from the College of Arts barely earn over 3k in monthly income. This is consistent with our experience that students from arts or social sciences schools usually earn less than their peers with engineering or business backgrounds.
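library(dplyr)

# Average the basic monthly mean and median (columns 7 and 8) per NTU school;
# the results are named basic_monthly_mean_mean and basic_monthly_median_mean
monthlyNTU <- data %>%
  filter(university == "Nanyang Technological University") %>%
  group_by(school) %>%
  summarise_at(.vars = c("basic_monthly_mean", "basic_monthly_median"),
               .funs = c(mean = "mean"))
# Horizontal bar chart with schools ordered by their averaged median salary
p <- ggplot(data.frame(monthlyNTU), aes(x = reorder(school, basic_monthly_median_mean),
                                        y = basic_monthly_median_mean)) +
  geom_bar(stat = "identity", fill = "steelblue", alpha = 0.5) +
  xlab("Schools") + ylab("Basic Monthly Median of Graduates from NTU")
p + coord_flip()
The following graph shows the density of the employment rate for each university.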
# One density curve of the full-time permanent employment rate per university
ggplot(data, aes(x = employment_rate_ft_perm)) + geom_density(aes(colour = university)) +
  xlab("Employment Rate for Permanent Positions")
5. Shiny Application
Structure of ui
Please refer to the ui.R file for more details. Description of layout.
Structure of server
Please refer to the server.R file for more details.
Tables
Graphs
6. Deployment
Here we describe the steps we took to launch the application on a web server.